Machine Learning Framework for the Prediction of Alzheimer’s Disease Using Gene Expression Data Based on Efficient Gene Selection
Authors
Abstract
In recent years, much research has focused on using machine learning (ML) for disease prediction based on gene expression (GE) data. However, many diseases have received considerable attention, whereas some, including Alzheimer’s disease (AD), have not, perhaps due to data shortage. The present work is intended to fill this gap by introducing a symmetric framework to predict AD from GE data, with the aim to produce the most accurate prediction using the smallest number of genes. The framework works in four stages after it receives a training dataset: pre-processing, gene selection (GS), classification, and prediction. The symmetry of the model is manifested in all of its stages. In the pre-processing stage, the gene columns in the training dataset are pre-processed identically. In the GS stage, the same user-defined filter metrics are invoked on every gene individually, and so are the same user-defined wrapper metrics. In the classification stage, the ML models are applied identically to the minimal set of genes selected in the preceding stage. The core of the proposed framework is a meticulous GS algorithm, which we designed to nominate eight subsets of the original set of genes provided in the training dataset. Exploring the eight subsets, the algorithm selects the best one to describe AD, and also the best ML model to predict AD using this subset. For credible results, the framework calculates performance metrics using repeated stratified k-fold cross validation (CV). To evaluate the framework, we used an AD dataset of 1157 cases and 39,280 genes, obtained by combining a number of smaller public datasets. The cases were split into two partitions: 1000 for training/testing, using 10-fold CV repeated 30 times, and 157 for validation. From the testing/training phase, the framework identified only 1058 genes to be the most relevant, and the support vector machine (SVM) model to be the most accurate with these genes. In the final validation, we used the 157 cases that were never seen by the SVM classifier. For credible performance evaluation, we evaluated the classifier via six metrics, for which we obtained high values. Specifically, sensitivity (recall), specificity, precision, kappa index, AUC, and accuracy all fell between 0.945 and 0.98, with distinct values of 0.97, 0.98, 0.945, 0.972, and 0.975.
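The evaluation protocol described in the abstract (gene selection, an SVM classifier, repeated stratified k-fold CV on one partition, then six metrics on a held-out partition) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the eight-subset GS procedure is replaced here by a single ANOVA F-test filter from scikit-learn, the data are synthetic, and the function name `evaluate_pipeline`, the parameter defaults, and the 15% validation share are all hypothetical. The number of repeats is kept at 3 for speed, whereas the abstract reports 30.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import (RepeatedStratifiedKFold,
                                     cross_val_score, train_test_split)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def evaluate_pipeline(X, y, n_genes=20, n_splits=10, n_repeats=3, seed=0):
    """Hypothetical helper: CV on a development partition, then six
    metrics on a validation partition the classifier has never seen."""
    X_dev, X_val, y_dev, y_val = train_test_split(
        X, y, test_size=0.15, stratify=y, random_state=seed)

    # Every gene column is scaled identically; the filter (an ANOVA F-test,
    # standing in for the paper's GS algorithm) sits inside the pipeline,
    # so it is refit on the training folds of each split.
    model = make_pipeline(StandardScaler(),
                          SelectKBest(f_classif, k=n_genes),
                          SVC(kernel="linear"))

    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats,
                                 random_state=seed)
    cv_scores = cross_val_score(model, X_dev, y_dev, cv=cv,
                                scoring="accuracy")

    # Refit on the whole development partition, then score the hold-out.
    model.fit(X_dev, y_dev)
    y_pred = model.predict(X_val)
    y_score = model.decision_function(X_val)

    validation = {
        "sensitivity": recall_score(y_val, y_pred, pos_label=1),
        "specificity": recall_score(y_val, y_pred, pos_label=0),
        "precision": precision_score(y_val, y_pred, zero_division=0),
        "kappa": cohen_kappa_score(y_val, y_pred),
        "auc": roc_auc_score(y_val, y_score),
        "accuracy": accuracy_score(y_val, y_pred),
    }
    return {
        "cv_accuracy_scores": cv_scores,
        "cv_accuracy_mean": float(cv_scores.mean()),
        "validation": validation,
    }


# Synthetic stand-in for a GE matrix: 200 cases, 300 "genes".
X, y = make_classification(n_samples=200, n_features=300,
                           n_informative=15, random_state=0)
report = evaluate_pipeline(X, y)
print(round(report["cv_accuracy_mean"], 3))
for name, value in report["validation"].items():
    print(f"{name}: {value:.3f}")
```

Keeping the selection step inside the pipeline is a deliberate choice in this sketch: selecting genes on the full development partition before cross-validating would let information from the test folds leak into the selected subset and inflate the CV estimate.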
Similar resources
Prediction of blood cancer using leukemia gene expression data and sparsity-based gene selection methods
Background: DNA microarray is a useful technology that simultaneously assesses the expression of thousands of genes. It can be utilized for the detection of cancer types and cancer biomarkers. This study aimed to predict blood cancer using leukemia gene expression data and a robust ℓ2,p-norm sparsity-based gene selection method. Materials and Methods: In this descriptive study, the microarray ...
Feature Selection and Classification of Microarray Gene Expression Data of Ovarian Carcinoma Patients using Weighted Voting Support Vector Machine
DNA microarray gene expression gives access to a wealth of information with thousands of variables (genes). Analysis of this information can reveal the genetic causes of disease and of differences between tumors. In this study we try to reduce the high-dimensional data with a statistical method, selecting valuable, high-impact genes as biomarkers, and then classify ovarian tumors based on gene expression data of...
Classification and Biomarker Genes Selection for Cancer Gene Expression Data Using Random Forest
Background & objective: Microarray and next generation sequencing (NGS) data are important sources for finding helpful molecular patterns. The large volume of gene expression data also makes it harder to identify the biomarkers associated with cancer. The random forest (RF) is used to effectively analyze the problems of large-p and smal...
Oral Cancer Prediction Using Gene Expression Profiling and Machine Learning
Oral premalignant lesion (OPL) patients have a high risk of developing oral cancer. In this study we investigate using machine learning techniques with gene expression profiling to predict the possibility of oral cancer development in OPL patients. Four classification techniques were used: support vector machine (SVM), Regularized Least Squares (RLS), multi-layer perceptron (MLP) with back prop...
Using Rule-Based Machine Learning for Candidate Disease Gene Prioritization and Sample Classification of Cancer Gene Expression Data
Microarray data analysis has been shown to provide an effective tool for studying cancer and genetic diseases. Although classical machine learning techniques have successfully been applied to find informative genes and to predict class labels for new samples, common restrictions of microarray analysis such as small sample sizes, a large attribute space and high noise levels still limit its scie...
Journal
Journal title: Symmetry
Year: 2022
ISSN: 0865-4824, 2226-1877
DOI: https://doi.org/10.3390/sym14030491